fix(cli): #37 — real macOS LAContext biometric gate via objc2 FFI#38
Open
hanwencheng wants to merge 1 commit intomainfrom
Open
fix(cli): #37 — real macOS LAContext biometric gate via objc2 FFI#38hanwencheng wants to merge 1 commit intomainfrom
hanwencheng wants to merge 1 commit intomainfrom
Conversation
Supersedes PR #27's stub. Wires actual Touch ID / Face ID enforcement on macOS through `LAContext.evaluatePolicy`, with a trait-seam design so non-macOS platforms, tests, and the env-opt-out path are all first-class. ## Design ``` biometric/ ├── mod.rs trait BiometricBackend + require_biometric() entry point ├── error.rs typed BiometricError enum (all LAError codes mapped) ├── logic.rs pure functions: parse_la_error, redact_prompt_reason ├── lacontext.rs macOS backend — real LAContext.evaluatePolicy FFI └── stdin.rs non-macOS backend — stdin y/N fallback ``` - `AGENTKEYS_BIOMETRIC=on` activates the gate; default is bypass (preserves existing CLI behavior for scripts / CI). - `AGENTKEYS_ALLOW_NO_BIOMETRIC=1` (non-macOS) skips the stdin TTY guard. - `cmd_approve`, `cmd_revoke`, `cmd_teardown` call `require_biometric` with reason strings that are scrubbed via `redact_prompt_reason` — long opaque tokens get replaced with `<redacted>` so `revoke <session-token>` can't leak the token to terminal scrollback (codex PR #27 P2). ## unsafe inventory (~4 blocks, all with SAFETY comments) 1. `LAContext::new()` — all objc2 message-sends are unsafe 2. `canEvaluatePolicy_error` synchronous call 3. `*mut NSError` dereference inside the completion block (null-checked defensively even though Apple's contract guarantees non-null on failure) 4. `evaluatePolicy_localizedReason_reply` async method send Deadlock / leak protections: - 60-second `recv_timeout` on the channel (CLI can never hang forever) - No `Retained<LAContext>` captured inside the completion block (avoids retain cycle) - Single-shot: each `authenticate` call builds its own LAContext ## Tests — 4 layers **L1 — Pure logic (22 tests, runs everywhere):** - `parse_la_error` one test per documented NSError code - `redact_prompt_reason` preservation of 0x addresses, stripping of long tokens, idempotence - `biometric_is_enabled` env-var parsing **L2 — FFI boundary (2 tests, macOS only):** - `la_context_constructs_and_drops` — catches dylib load / linker issues - `can_evaluate_policy_is_synchronous_on_ci_runner` — smoke-test that the FFI wiring doesn't crash when called in a test context **L3 — Behavioral via MockBackend:** - `mock_backend_receives_redacted_reason` — critical regression for the P2 token-leak finding; asserts raw tokens never reach the backend - `mock_backend_returns_scripted_error` — scripted-response validation **L4 — Manual QA (`docs/manual-test-issue-37.md`):** 7 scenarios including Touch ID success/cancel, lockout → passcode fallback, device without biometry, env opt-out, token-leak regression, non-macOS stdin fallback. ## Dependencies macOS only (gated `[target.'cfg(target_os = \"macos\")'.dependencies]`): - `objc2 = \"0.6\"` - `objc2-foundation = \"0.3\"` - `objc2-local-authentication = \"0.3\"` - `block2 = \"0.6\"` Zero cost on Linux/Windows — objc2 + block2 don't enter the build graph. Test: cargo test -p agentkeys-cli -- --test-threads=1 biometric: 22 passed; cli: 25 passed Closes #37. Supersedes PR #27 (which should be closed on merge). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
4 tasks
Member
Author
|
this also close #11 |
hanwencheng
added a commit
that referenced
this pull request
Apr 15, 2026
Augments the auto-generated Claude Code + Claude Code Review workflows
with context from 15+ PR review cycles in this repo so the Action
produces findings consistent with recent codex iterations instead of
generic Rust advice.
## claude-code-review.yml
- Scope `on.pull_request.paths` to `crates/**`, `docs/**`, `wiki/**`,
workflows, Cargo.toml, CLAUDE.md, and harness/. Skips cheap Cargo.lock
churn.
- `fetch-depth: 0` so Claude can inspect `git log` / `git blame` during
review (useful for "this finding predates the PR" arguments).
- `dtolnay/rust-toolchain@stable` + `Swatinem/rust-cache@v2` so every
`cargo check` / `cargo test -p <crate>` in-session runs fast.
- Custom prompt injects:
- crate names (agentkeys-types, agentkeys-core, etc)
- pointer to CLAUDE.md for architecture + mock-server design principles
- pointer to the new .github/REVIEW_GUIDELINES.md for agentkeys-specific
review patterns
- `--test-threads=1` requirement (tests mutate shared HOME/keyring)
- the 8-pattern checklist (audit-log DENIED rows, URL-encoding via
reqwest .query(), session-token redaction, case-insensitive wallet
comparison, 30-day TTL, synchronous keychain ops, path-traversal
guards, cross-wallet credential safety)
- `claude_args --allowed-tools` whitelist for cargo/git/gh so the
Action can actually run the cargo commands the prompt tells it to.
## claude.yml (@claude mentions)
- Same Rust toolchain + cache setup so `@claude run tests` /
`@claude check clippy` requests don't pay cold-compile cost.
- `fetch-depth: 0` for git-history tools.
- Same `claude_args --allowed-tools` whitelist plus `gh pr comment:*` /
`gh pr edit:*` so @claude can update PR bodies and comment back with
findings.
## .github/REVIEW_GUIDELINES.md (new)
Single source of truth for agentkeys review patterns, extracted from
PRs #18-#38 (fix/issue-10 through fix/issue-37). Documents:
- Test constraints (`--test-threads=1`, per-crate targeting)
- 10 canonical bug patterns that codex has flagged repeatedly
- Architectural invariants (master→agent single-hop; no users yet)
- Scope-control guidance (no speculative refactors, no backwards-compat
shims pre-launch)
- Policy for codex-vs-claude disagreements
Each pattern has a PR/issue reference so reviewers (and future Claude
runs) can trace why the rule exists.
hanwencheng
added a commit
that referenced
this pull request
Apr 15, 2026
* "Claude PR Assistant workflow"
* "Claude Code Review workflow"
* ci: wire agentkeys-specific context into Claude Code workflows
Augments the auto-generated Claude Code + Claude Code Review workflows
with context from 15+ PR review cycles in this repo so the Action
produces findings consistent with recent codex iterations instead of
generic Rust advice.
## claude-code-review.yml
- Scope `on.pull_request.paths` to `crates/**`, `docs/**`, `wiki/**`,
workflows, Cargo.toml, CLAUDE.md, and harness/. Skips cheap Cargo.lock
churn.
- `fetch-depth: 0` so Claude can inspect `git log` / `git blame` during
review (useful for "this finding predates the PR" arguments).
- `dtolnay/rust-toolchain@stable` + `Swatinem/rust-cache@v2` so every
`cargo check` / `cargo test -p <crate>` in-session runs fast.
- Custom prompt injects:
- crate names (agentkeys-types, agentkeys-core, etc)
- pointer to CLAUDE.md for architecture + mock-server design principles
- pointer to the new .github/REVIEW_GUIDELINES.md for agentkeys-specific
review patterns
- `--test-threads=1` requirement (tests mutate shared HOME/keyring)
- the 8-pattern checklist (audit-log DENIED rows, URL-encoding via
reqwest .query(), session-token redaction, case-insensitive wallet
comparison, 30-day TTL, synchronous keychain ops, path-traversal
guards, cross-wallet credential safety)
- `claude_args --allowed-tools` whitelist for cargo/git/gh so the
Action can actually run the cargo commands the prompt tells it to.
## claude.yml (@claude mentions)
- Same Rust toolchain + cache setup so `@claude run tests` /
`@claude check clippy` requests don't pay cold-compile cost.
- `fetch-depth: 0` for git-history tools.
- Same `claude_args --allowed-tools` whitelist plus `gh pr comment:*` /
`gh pr edit:*` so @claude can update PR bodies and comment back with
findings.
## .github/REVIEW_GUIDELINES.md (new)
Single source of truth for agentkeys review patterns, extracted from
PRs #18-#38 (fix/issue-10 through fix/issue-37). Documents:
- Test constraints (`--test-threads=1`, per-crate targeting)
- 10 canonical bug patterns that codex has flagged repeatedly
- Architectural invariants (master→agent single-hop; no users yet)
- Scope-control guidance (no speculative refactors, no backwards-compat
shims pre-launch)
- Policy for codex-vs-claude disagreements
Each pattern has a PR/issue reference so reviewers (and future Claude
runs) can trace why the rule exists.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Wires real Touch ID / Face ID on macOS via
LAContext.evaluatePolicythrough the objc2 + block2 FFI stack. Supersedes PR #27, which landed a stub that logged a prompt and returnedOk(())on macOS.Closes #37.
Design — trait seam
AGENTKEYS_BIOMETRIC=onactivates the gate; default is bypass (preserves existing CLI behavior for scripts / CI).AGENTKEYS_ALLOW_NO_BIOMETRIC=1(non-macOS only) skips the stdin TTY guard.cmd_approve,cmd_revoke,cmd_teardowncallrequire_biometricwith reason strings scrubbed viaredact_prompt_reason— long opaque tokens get replaced with<redacted>sorevoke <session-token>can't leak the token to terminal scrollback (the critical P2 codex flagged on PR fix(cli): #11 biometric gate for high-security master CLI actions (macOS) #27).unsafe inventory
~4 blocks, each 2-5 lines, all annotated with
// SAFETY:comments:LAContext::new()— all objc2 message-sends are unsafecanEvaluatePolicy_errorcapability check*mut NSErrordereference inside the completion block (null-checked defensively even though Apple's contract guarantees non-null on failure)evaluatePolicy_localizedReason_replyasync method sendDeadlock / leak protections:
recv_timeouton the channel — CLI can never hang foreverRetained<LAContext>captured inside the completion block (avoids retain cycle)authenticatecall builds its own LAContextTests — 4 layers
L1 — Pure logic (22 tests, runs on every platform)
parse_la_error— one test per documented LAError code (-1 through -10)redact_prompt_reason— 0x-address preservation, long-token stripping, idempotencebiometric_is_enabled— env-var parsing (on/1/true/ unset /off)L2 — FFI boundary (2 tests,
#[cfg(target_os = "macos")])la_context_constructs_and_drops— catches dylib load / linker issuescan_evaluate_policy_is_synchronous_on_ci_runner— smoke-test for FFI wiringL3 — Behavioral via MockBackend
mock_backend_receives_redacted_reason— critical regression test for PR fix(cli): #11 biometric gate for high-security master CLI actions (macOS) #27's P2 token-leak findingmock_backend_returns_scripted_error— scripted-response validationL4 — Manual QA (
docs/manual-test-issue-37.md)7 scenarios requiring real macOS Touch ID hardware — cannot be automated:
AGENTKEYS_BIOMETRIC=offescape hatchrevoke <raw-token>must NOT echo the token to stderr / scrollbackDependencies
Gated with
[target.'cfg(target_os = "macos")'.dependencies]— zero cost on Linux / Windows builds:Test plan
cargo test -p agentkeys-cli biometric -- --test-threads=1→ 22 passed (L1 + L3 + L2 on macOS)cargo test -p agentkeys-cli -- --test-threads=1→ 25 existing CLI integration tests still green with gate default-offcargo check -p agentkeys-cli→ clean compile on macOS with full FFI stack linkeddocs/manual-test-issue-37.md(requires reviewer with Touch ID hardware)Issue
Closes #37. Supersedes #27 — PR #27 should be closed when this merges (the trait-seam design replaces its
biometric.rsentirely).